ABSTRACT



An automatic pattern recognition method has a short processing time, can be applied to nonlinear separation problems, and can perform similarity calculations. The method: divides a plurality of sample data of known categories into a plurality of classes; when the sample data in a divided class are not all in the same category, repeats dividing the sample data into subclasses until the sample data in a subclass have only one category; expresses the relationship between classes and subclasses in a tree-structure representation and determines the standard pattern for each class and subclass from the sample data contained there; and checks which of the tree-structured classes input data of unknown category is nearest, by calculating the distance to the standard pattern of each class, and then, when the class has subclasses, performs a similar check until the lowest-level subclass is reached to determine the subclass the input data is closest to. The category of the lowest-level subclass is taken as the category of the input data.

6 Claims, 12 Drawing Sheets 
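
The method summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `kmeans`, `learn`, `recognize` and `max_dist` are hypothetical, a plain k-means split into two subclasses is substituted for the clustering procedure of the specification, and the Euclidean distance is assumed.

```python
import math
from collections import Counter

def centroid(points):
    """Standard pattern of a class: the center of its sample data."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def kmeans(points, k, iters=20):
    """Assumed stand-in clustering step; returns index groups, empty ones dropped."""
    centers = []
    for p in points:                       # deterministic init: first k distinct points
        if p not in centers:
            centers.append(p)
        if len(centers) == k:
            break
    for _ in range(iters):
        groups = [[] for _ in centers]
        for idx, p in enumerate(points):
            j = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            groups[j].append(idx)
        centers = [centroid([points[i] for i in g]) if g else centers[j]
                   for j, g in enumerate(groups)]
    return [g for g in groups if g]

def learn(points, labels, k=2):
    """Divide recursively until every undivided (sub)class holds one category."""
    node = {"pattern": centroid(points), "children": [], "category": None}
    groups = kmeans(points, k) if len(set(labels)) > 1 else []
    if len(groups) < 2:                    # pure class, or data that cannot be split
        node["category"] = Counter(labels).most_common(1)[0][0]
        return node
    for g in groups:
        node["children"].append(
            learn([points[i] for i in g], [labels[i] for i in g], k))
    return node

def recognize(root, x, max_dist):
    """Follow one path of the tree; return None when the input is rejected."""
    node = root
    while node["children"]:
        node = min(node["children"], key=lambda c: math.dist(x, c["pattern"]))
        if math.dist(x, node["pattern"]) >= max_dist:
            return None                    # too far from the nearest standard pattern
    if math.dist(x, node["pattern"]) >= max_dist:
        return None
    return node["category"]

# Category "B" lies between two groups of category "A" (not linearly separable).
points = [(0, 0), (1, 0), (4, 0), (5, 0), (6, 0), (9, 0), (10, 0)]
labels = ["A", "A", "B", "B", "B", "A", "A"]
tree = learn(points, labels)
print(recognize(tree, (5.2, 0), 4.0))   # B
print(recognize(tree, (30, 0), 4.0))    # None (rejected)
```

Only the standard patterns along one path of the tree are compared with the input, and the rejection threshold is applied at each level, which is the behavior the abstract attributes to the method.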
AUTOMATIC CLUSTERING METHOD

BACKGROUND OF THE INVENTION

The present invention relates to automatic data clustering, which is particularly useful in pattern recognition, for example in speech pattern recognition, image pattern recognition or text/character pattern recognition.

Pattern recognition, and particularly data clustering, involves a large volume of sample data for learning and a determination of a category of input data for pattern recognition, and more particularly the identifying and classifying of data representing a speech pattern or an image pattern, for example.

In pattern recognition, there are generally two representative methods that employ learning with high volumes of known sample data having plural categories for determining a region that each known category occupies in a pattern space and then determining the category of unknown data according to the region of the unknown data. The two methods may be represented by the Artificial Intelligence Handbook, published by Ohm, "Pattern Matching", page 324, compiled by the Artificial Intelligence Association; and Chapter 8 of Parallel Distributed Processing, entitled "Learning Internal Representations by Error Propagation", by D. E. Rumelhart and others, compiled by the Institute for Cognitive Science, University of California, San Diego.

Specifically, the two methods that employ the above are:

(1) Pattern matching by preparing the standard pattern for each category and taking as the category for unknown input data a category whose standard pattern is nearest to the input data. There are some pattern matching methods that do not prepare standard patterns, but instead use sample data of known categories and then take the category of the sample data which are closest to the input data to be the category of the unknown input data, which is known as the nearest neighbor method.

(2) A layer-type neural network arranges non-linear units called neurons in layers and learns transformation rules between sample data and the categories as weights between neurons. A commonly used learning method is back propagation based upon the steepest descent method. Output data produced when unknown input data are given to the neural network are taken as the category of the unknown input data.

SUMMARY OF THE INVENTION

It is an object of the present invention to analyze the prior art techniques for pattern recognition (especially the clustering technique), and to identify problems and the sources of the problems, so that such problems may be solved or overcome.

In classifying or identifying data, one should consider not only the decision accuracy, but also the time taken for obtaining such decisions. As to the decision accuracy in particular, it is essential in practice to have a function that can check the similarity of a pattern and reject it when the pattern is other than that of an expected category. Problems with conventional methods in this respect are described below as they relate to the above mentioned two prior art methods.

(1) In the pattern matching, where the similarity is determined by calculating distances of the input pattern to the standard patterns, the determination of the nearest standard pattern requires the input data to be compared with each of the standard patterns. Therefore, the processing time is generally very large. Further, since this method depends upon the distance of the input data with respect to the standard patterns for checking the similarity, the method often cannot be applied to non-linear separation problems where multi-dimensional sample data are distributed such that one or more sample data points are contained in other data, for example where the input data have multiple categories.

(2) The layer-type neural network determines a separation surface for categorizing, when solving problems including non-linear problems. That is, the separating surface may be linear or it may be non-linear. For example, in a two-dimensional problem in which data group A and data group B are to be separated, as shown in FIG. 11, which figure is useful for analyzing the prior art, which analysis is part of the invention, a separating surface L0 is obtained as a result of learning. Contour lines, shown as dotted lines in FIG. 11, produced by the neural network are symmetrical with respect to the separating surface L0. Hence unknown input data of, say, point a and point b have the same output values in the layer-type neural network, and they are both therefore classified into group A on one side of the separating surface L0. The similarity of the input data with the known sample data is usually defined as a distance from a distribution center of the sample data. The distribution center may also be defined as a standard pattern or code book. This means that data at point b must be significantly lower in similarity than data at point a. Despite this fact, the layered neural network output values for these points a and b are the same. This reveals the inability of the layered neural network method to correctly calculate similarity. This also indicates that while the layer-type neural network method can be applied to simple classifications, it is not suitably applied to classifications that depend upon similarity.

In FIG. 11, the separating surface L0 is a flat plane, linear in the figure. The layer-type neural network may also employ non-linear planes or lines as separating surfaces, but the contour lines, dotted lines in FIG. 11, would be spaced from the non-linear lines in a manner similar to a topographical map, so that two data points, similar to a and b in FIG. 11, could still be on the same contour line and be classified in the same group even though they are very far apart, in the same manner as the points a and b in FIG. 11. A non-linear type of layer-type neural network employing a non-linear separating surface therefore has the same problem as that discussed with respect to FIG. 11 employing a linear separating surface.

It is an object of the present invention to provide an automatic classification or clustering method for input data, particularly for pattern recognition, which solves the above mentioned problems experienced with conventional methods, has a short processing time, and can be applied to non-linear separation problems while still accurately determining similarity.

To solve the above mentioned problems recognized and analyzed with respect to the prior art according to the present invention, the present invention employs the following steps:
(1) Dividing sample data of a plurality of known categories into a plurality of classes or regions during learning;

(2) When all the sample data in a divided class do not have the same category, further dividing the data into subclasses, and recursively repeating the division for any subclass that does not have only one category for the same sample data until all undivided classes and subclasses have only one category of sample data within them, that is until each undivided class or subclass contains only one category of sample data, as a further part of learning;

(3) As a further part of the learning, expressing the relationship between the classes and subclasses in a tree structure representation with the tree division corresponding to the class/subclass division, and determining the standard pattern from the sample data contained in each class and subclass; and

(4) After the above mentioned learning, providing input data of unknown category, and deciding, from a distance of input data of an unknown category from the standard pattern of each class, which of the tree structured classes the input data are nearest; then when the decided class is divided into subclasses in the tree structure, determining to which of the subclasses the input data are closest on the basis of the distance to the standard pattern of each subclass; and repeating the above process for each following subclass along the tree structure until a lowest-level subclass, that is a terminal or end-leaf or undivided subclass, is reached in the tree structure and taking the category of such lowest-level subclass as the recognized category of the input data.

Because of the steps numbered (1) and (2), even when the sample data is in an inclusive state, since the so called non-linear separation problem can be converted into a plurality of partial linear problems by dividing the data, the method of this invention can be applied to non-linear separation problems and can also calculate similarity at high accuracy and high speed. As to the step (3), there is provided a tree structure for data classification and standard patterns associated with each node of the tree structure. The step (4) performs checks according to the tree structure obtained by the step (3). Because of the above features, the matching or comparison of the unknown input data does not have to be done for all standard patterns, but only for the standard patterns along one path of the tree structure with no branching, which speeds up the processing as compared to the prior art methods. The similarity of unknown input data can be determined by calculating the distance to the standard patterns only at each branch of the tree structure along the path taken, and calculations for other branches that are not along the path taken do not have to be made.

BRIEF DESCRIPTION OF THE DRAWING

Further objects, features and advantages of the present invention will become more clear from the following detailed description of a preferred embodiment, shown in the accompanying drawings wherein:

FIG. 1 is a flow chart for the learning process that uses sample data of a known category, that is a flow chart for generating the tree structure for data classification and standard patterns;

FIG. 2 is a flow chart for identifying a category of input data of unknown category, according to the tree structure obtained from FIG. 1;

FIG. 3 is an example of a tree structure and standard pattern generated according to the learning process of FIG. 1 and stored in a memory, for example, a computer memory;

FIG. 4 is a flow chart for generating subclasses as performed by step 2 of FIG. 1, for determining the subclass of the sample data and for determining standard patterns contained in a subclass;

FIG. 5 shows a distribution of sample data in a two-dimensional space defined by axes of two physical quantities;

FIG. 6 shows an outline of principles of dividing actual sample data of FIG. 5 into subclasses;

FIG. 7 shows the tree structure showing the relationship among subclasses, each containing actual sample data;

FIG. 8A shows the result of identifying the category of input data whose category is unknown by using the standard patterns obtained;

FIG. 8B shows the result of category identification with the condition of distance (similarity) softened;

FIG. 9 shows a distribution of sample data in a pattern space, with the division between subclasses and some standard pattern;

FIG. 10 is a tree structure of subclasses and sample data belonging to each subclass;

FIG. 11 is a diagram illustrating the present invention with respect to a conventional layer-type neural network; and

FIG. 12 is an example of a system configuration that embodies the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

As used herein, the term "standard pattern" is equivalent to the known term "code book".

A preferred embodiment of the present invention will be described, particularly starting with FIGS. 1-8. FIGS. 5, 6 and 7 are useful in discussing the principle of the present invention for generating a tree structure for data clustering and standard patterns, according to the learning mode. Data clustering is a technique that is particularly useful in pattern recognition, and pattern recognition may be classified into various types, for example image pattern recognition or speech pattern recognition or text pattern recognition. Generally in a learning mode, sample data of known category or known categories are used to teach a system and thereafter, based upon the results of teaching/learning, in a recognition mode, data of unknown category may be input so that the category of the unknown input data may be determined.

FIG. 5 shows an example of the distribution of sample data in a two-dimensional space. As used herein, "sample data" refers to data used in the learning mode, and "input data" refers to data to be categorized in a recognition mode according to the results of the learning mode. A learning mode is the same as a teaching mode.

The two dimensions of the two-dimensional space are respectively plotted on the X axis as a physical quantity 1, for example length as a first dimension. The second dimension is plotted along the Y axis as a physical quantity 2, for example weight. These dimensions may be of any type, including the mentioned physical quantities, in practice. Multi-dimensional data having dimensions far greater than two may be employed, as will be discussed later, but for the purpose of understanding the






5,329,596 


basic principle of the present invention, it is sufficient to consider only two-dimensional data.

The sample data to be used in this learning mode have two known categories, represented in the drawing by a first category or category 1 indicated by an open or white-filled circle, and category 2 indicated by a closed or solid black filled circle. From the drawing, it is seen that the category 1 data have a general locus of a spiral and category 2 data have the general locus of a spiral within the spiral of the category 1 data, so that a separation line drawn between the two spirals would be a curved or two-dimensional line. Since the separation line is curved, it is non-linear and therefore the separation problem presented by FIG. 5 is considered a non-linear separation problem.

The sample data of FIG. 5 are shown in the center of FIG. 6, and FIG. 6 is a schematic representation of the learning mode, wherein the sample data are clustered into classes and subclasses according to category.

First, the sample data of FIG. 5 are classified into a plurality of classes, namely C1, C2, C3, C4, C5 as shown in FIG. 6. The number of classes may be arbitrarily set, and here by way of example the number of classes for the initial division is set at five. This division is obtained as follows:

The geometric center of the sample data is determined and at the geometric center, five closely spaced points, for example in a small ring, are considered. With respect to each of these five points, each of the closely adjacent circles representing the closest of the sample data have their distances measured to each of the points, and these closely adjacent sample data circles closest to a point are grouped to belong to the respective class of the closest point, so that only the closely adjacent of the sample data are classified into respective ones of the five classes according to their closeness to the five points representing the five classes. The points of the classes will now shift outward as they will now represent the standard pattern or code book or geometric center of the now included closely adjacent data circles.

Next, moving outward, the next closest adjacent circles of the sample data are classified by again calculating the distance from each of them to each of the new class standard patterns, the standard patterns are revised to take into consideration the newly added data for the class, and these steps are repeated until all of the data circles are classified into one of the five classes. In this manner, the classes take on virtual boundaries as shown in FIG. 6.

When one of these classes does not have only a single category of data within it, it is further subdivided in the manner mentioned above into a plurality of subclasses. For example, class C1 is seen to have data of both category 1 and category 2 within its boundary. Therefore, the sample data of class C1 is further divided or clustered into subclasses C11, C12, C13, C14, C15 by the same clustering method that was used for determining the original classes C1-C5. It is seen that the subclasses C11, C12, C13, C14, C15 each contain only one category of data, so that no further subdivision is needed. For each class and subclass shown, the final standard pattern is shown as a large black dot, which as mentioned represents the geometric center of the data within the respective class or subclass. In the manner mentioned above, the other classes C2, C3, C4, C5 are further subdivided into subclasses.

While one method of dividing or clustering data has been set forth above according to the present invention, other methods of dividing or clustering may be employed according to the present invention. By way of example, the class C3 has been shown as being subdivided into subclasses C31, C32, C33 and C34. When the sample data in a subclass are not all of the same category, the subclass is divided again. As seen, subclass C32 contains both category 1 data and category 2 data. Therefore, subclass C32 is divided into further subclasses C321 and C322, and such division continues until each undivided class or subclass has only data of one category. Similarly, subclasses C411, C412 and C511, C512, C513, C514 are subdivisions of subclasses C41 and C51, respectively. Therefore, it is seen that the symbols C1, C2, C3, ... C12, C22, ... C514 represent the names of classes and subclasses. Terminal or undivided classes and subclasses are C11-C15, C21-C24, C31, C321, C322, C33, C34, C411, C412, C42, C43, C44, C45, C511-C514, C52, C53, C54, C55. Thus, category 1 data are only in subclasses C11, C14, C15, C22, C24, C321, C31, C44, C45, C412, C54, C55, C511, C512, C513, and category 2 data are in the remaining undivided subclasses.

According to the above, it is seen that a standard pattern, represented by the large black dot, is determined for each class and subclass from the sample data contained therein. The standard pattern is the center of sample data distribution of each class or subclass.

The divisional relationship among the classes C1-C5 and subclasses C11-C55 is expressed in a tree-structure representation as shown in FIG. 7. In FIGS. 5 and 6 the category 1 data are represented by white circles, the category 2 data are represented by black circles and the standard patterns are represented by large black circles.

FIG. 1 is a flow chart for the learning process steps in the learning mode of the present invention, which uses sample data of known categories, for example the data discussed above with respect to FIGS. 5, 6 and 7. By using the sample data of known categories, the process steps according to the flow chart of FIG. 1 will generate the tree structure of FIG. 7 for data classification or clustering into standard patterns.

The learning proceeds with all sample data being used as the argument for the process. Step 1 divides all sample data into a plurality of classes, for example the classes C1-C5 of FIG. 6 as described previously. Next, step 1 decides if each class has only one category of data.

If the answer to step 1 is no, processing proceeds to step 2. In step 2, a class that does not contain only one category of data is subdivided into subclasses and for each such subclass, the standard pattern for the subclass is determined. For example, in step 2, class C3 of FIG. 6 is subdivided into subclasses C31, C32, C33 and C34, and each subclass has its standard pattern determined.

Step 3 will register the number of subclasses and the standard pattern of each subclass in a storage pointer group. Also, step 3 will register the standard patterns in memory locations specified by the pointers to thereby generate the tree structure shown in FIG. 7.

Step 4 will make a recursive call to the routine, that is proceed to step 1 with the sample data contained in the subclass taken as an argument. Thereby, on the recursive call, for example, subclass C32 may be further subdivided into subclasses C321 and C322 by step 2.

After the recursive call of step 4, step 5 checks to see if the recursive calls made to date have finished for all of the subclasses, that is if after all of the recursive calls, the undivided subclasses and undivided classes each
contain only data of one category. If the answer to the question of step 5 is yes, the learning mode is ended, and if the answer is no, the processing proceeds to step 4 for another recursive call.

Step 5 is a recursive call for each subclass whether or not it contains only one category. Thereby, step 5 requires that each class and subclass is subject to a recursive or original processing through steps 1, 2 and 3. After a sufficient number of recursive calls, step 1 will produce an answer yes that each undivided subclass and undivided class have only one category of data therein, and the flow would proceed to step 6.

In step 6, the standard patterns and the category names are registered in memory as a part of the tree structure, to complete the tree structure, for example as shown in FIG. 7.

In the flow chart of FIG. 1, step 2 will be described later in greater detail with respect to FIG. 4.

The procedure shown in the flow chart of FIG. 1 repeatedly subdivides a plurality of sample data of known categories until all the data in each lowest level class/subclass are of the same category, while at the same time generating the tree structure for data classification or clustering along with generating and storing a standard pattern for each branch of the tree structure. Further, the category of each lowest level class and lowest level subclass is stored. An example of the result is shown in FIG. 7.

FIG. 3 shows another example of a learned tree structure with the standard patterns generated and stored in memory as a result of performing the learning.

The tree structure of FIG. 7 is in the form of a graphic representation of the classes and subclasses with the categories identified for the lowest level of class or subclass, while the corresponding data in computer memory are represented in the tree structure of FIG. 3. The left hand block of FIG. 3 shows a subclass number and pointers that would be registered in accordance with step 3 of FIG. 1, for example, and the rightmost column of blocks would represent the registering of standard patterns with category names for the lowest level classes and lowest level subclasses in accordance with step 6 of FIG. 1.

The learning mode has now ended. Instead of an automatic learning mode as described above, the tree structure represented in FIG. 3 may be placed in memory after being manually generated or produced in any other manner. In any event, after the tree structure, such as that shown in FIG. 3, is provided in the memory, the recognition mode may be entered, which recognition mode will be discussed in detail with respect to the flow chart of FIG. 2.

Step 7 determines if the current class/current subclass is at the lowest level of the tree structure. If the answer to step 7 is no, processing proceeds to step 8 for calculation of the distances between the standard patterns of the next layer of branching from the current class/current subclass and the input data. That is, the similarity is determined between the input data and each of the next layer of subclasses of the current class or current subclass. For example, if the current class is C3 in FIG. 6: from the tree structure organization of data as shown in FIG. 3, the standard patterns for each of the subclasses C31, C32, C33, C34 are obtained; the distance from each of these four standard patterns to the input data is calculated to produce four distance calculations; and the input data are determined to belong to the one of the subclasses C31, C32, C33 or C34 that has the shortest calculated distance.

Next in step 9, it is checked to see if the distance calculation has been completed for all the subclasses. For example, if the distance calculation has been made only with respect to subclasses C31, C32 and C33, but not with respect to C34, the answer is no and processing returns to step 8 for the distance calculation between the input data and the standard pattern of subclass C34.

When the answer to step 9 is yes, processing proceeds to step 10. In step 10, a check is made to see if the minimum of the distances calculated in step 8 is less than a predetermined value. For example, the predetermined value may be set beforehand according to the degree of accuracy desired or according to other standards. In the example, the four calculated distances corresponding respectively to subclasses C31-C34 are compared to the predetermined value. If the answer to step 10 is no, processing proceeds to step 10', which will reject the data when the distances between the input data and the standard patterns are too large and thereby indicate no similarity. If the answer to step 10 is yes, step 11 will make a recursive call to this routine for the subclass that has the minimum distance calculated in step 8. In the example, if it is assumed that subclass C32 had the minimum calculated distance, so that the input data belonged to subclass C32, the recursive call to step 7 would then make subclass C32 the current subclass for step 7, so that step 8 would calculate the two distances between the input data and the standard patterns of subclasses C321 and C322, to be followed by steps 9, 10 and 11 that would result in a recursive call for subclass C321 or subclass C322, whichever had the minimum calculated distance that was within the predetermined value of step 10. At this point, if it is assumed that subclass C322 was a subject of the recursive call and therefore the current subclass in step 7, it is seen that step 7 would determine that the current subclass, namely subclass C322, is the lowest level of the tree structure and therefore processing would proceed to step 12.

Step 12 calculates the distance between the input data and the standard pattern of the lowest level subclass, which in the example would mean that step 12 would calculate the distance between the input data and the standard pattern (large dot in FIG. 6) of the lowest level subclass C322.

Step 13 checks to see if the distance calculated in step 12 is less than a predetermined value, and if the distance is not less than the predetermined value, the data will be rejected by passing to step 10', that is when the distance between the input data and the standard pattern is sufficiently large to indicate no similarity. If the distance is less than the predetermined value to produce an answer of yes in step 13, that is if similarity exists, processing proceeds to step 14.

In step 14, there is a registration in the memory of the category name of the lowest level subclass, and this category name of the lowest level subclass is thereby the output of the recognition mode, which is an identification or recognition of the input data. That is, it is determined by the recognition mode that the input data of unknown category is the category registered in step 14. In the example indicated above, it is seen that subclass C322 was registered in step 6 of FIG. 1 with category 2, so that the recognition of the input data as belonging to subclass C322, for the example given, has determined that the input data belongs to category 2.

With the recognition procedure shown in the FIG. 2 flow chart, the input data of unknown category are compared repetitively with the standard pattern of each subclass along a particular path of the tree classification, to determine the category of the input data. When the input data differ greatly from the learned sample patterns, the input data are rejected. It is now seen that the comparison of standard patterns flows along only a single path, which at each node of the tree takes only one branch. Thereby, the speed of the processing is correspondingly great in that it is not necessary to compare the input data with all of the standard patterns of the tree.

Steps 8 and 12 of FIG. 2 calculate distances, and these distances may be calculated between the input data and the standard patterns as a Euclidean distance or a Mahalanobis distance using variance co-variance matrices of sample data belonging to the subclass.

FIG. 8A shows the result of recognizing or identifying the category of input data whose category is unknown, according to the recognition processing flow of FIG. 2, by using the tree structure and the standard patterns of FIG. 3 obtained as a result of performing the learning processing sequence of FIG. 1. Small white and black circles represent sample data of two different categories as previously discussed with respect to FIGS. 5 and 6. Larger white and black circles represent input data of unknown categories. As is evident from FIG. 8A, coloring of circles (into black and white circles) distinguishes the identified categories, with input data of large white circles correctly classified into the same category as the small, white circle sample data and with input data of large black circles correctly classified into the same category as the small, black circle sample data. The blank area in the figure indicates the rejected area. In other words, input data more than a certain distance remote from the sample data is rejected as a result of the distance comparison (similarity) decision.

$({}_{n}x_1, {}_{n}x_2)$: sample data value belonging to subclass n.

[Equation 2]

$${}_{n}X(t+1) = {}_{n}X(t) - \alpha \frac{1}{L_n} \sum_{i=1}^{L_n} \left( {}_{n}X(t) - {}_{n}x_i \right)$$

where
$\alpha$: constant
X(t): standard pattern at time t.

Step 18 checks if the error E = |E(t-1) - E(t)| is smaller than a specific set value. E(t-1) represents a previous error and E(t) represents a current error.

Step 19 moves the standard patterns according to [Equation 2] set forth above when the error E is larger than the predetermined value, and then returns to an earlier step of the FIG. 4 flow. The differences between the standard patterns and the sample data belonging to them are summed up in each dimension. The standard patterns are moved in a direction represented by the total difference.

Then the distance between each sample data point and each standard pattern is calculated again to determine the standard pattern nearest each sample data. Then sample data are made to belong to the associated nearest standard pattern. The moving of each standard pattern is repeated until the total difference between the standard pattern and the sample data belonging to it is smaller than a certain value in each dimension.

When the error E becomes smaller than the specific set value, each standard pattern at this time point is taken to be the final standard pattern and each sample data is considered to belong to the subclass in which the nearest standard pattern resides. Then this process of step 2 is terminated.

In FIG. 6, relatively large black circles, which represent a standard pattern for each subclass, and sample data enclosed by a solid line which belong to the same subclass of each standard pattern, are determined by this processing step 2 of FIG. 4.

While the above embodiment deals with a two-dimensional separation problem, this invention can also be applied to separation problems in an n-dimensional space, where n is an integer greater than two. In that case, only the number of dimensions in the Equation 1 need be increased.

FIG. 8B shows the result of category identification with the distance (similarity) decision requirements
slightly loosened or softened (i.e., the predetermined FIGS. 9 and 10 illustrate the application of this inven- 

reference value for the distance decision is increased). tion to the classification of facial images. The sample 

FIG. 4 shows an example procedure for generating ^5 data represent a facial gray image, which has a 16-pixel- 

subclasses as carried out by learning step 2 of FIG. 1 by;16-pixel size. In other words, this represents a sepa- 

and determining sample data and standard patterns con- ra^on problem in a 256(16 x]6)-dimensional pattern 

tained in the generated subclasses. space, for example. 

Step 15 positions a plurality of standard subclass pat- FIG- ^ shows the distribution of sample data (white 

terns (n data) near the distribution center of the known 50 circles, white squares, and black circles) in the pattern 

g^ixiple space and the inclusive relationship among them. That 

Step'l6 determine the sundard pattern which is clos- is. class CI contains subclasses Cll. C12, C13; likewise, 

est to each sample data point and then determines sam- class C3 contains subclasses C31, C32. C33; class C4 

pie data which belong to each subclass that has tiie contains subclasses C41, C42. C43. C44; and class C2 

standard pattern as representative data. 55 contains no subclasses. Standard patterns (sample facial 

Step 17 calculates an error E between each standaixi images) for each class are shown. The sample facial 

pattern and the sample data belonging to the subclass images have three categories-"human" (white circles), 

according to Equation 1 bdow. "dog" (white squares), and "cat" (black circles). Be- 

(Equation 1] cause the 256-dimensional space cannot be represented 

60 on a sheet of paper, it is only schematically shown for 

E=z„-i.A-((i/Ln)i|„i.L;,{2/«i.2(„x,-,xf)2}} dimension x, y, z in a three axis image space. 

FIG. 10 illustrates the tree structure of subclasses and 

where the sample data belonging to each subclass. That is, 

(iPCi, nX2): standard pattern value for subclass n; n«l subclass Cll belonging to class CI represents two faces 

to N. 65 of a cat2, subclass C12 represents one face of a dogi. 

N: number of subclasses contained in a parent class subclass C13 represents one face of a catj, class C2 

(or subclass). represents two faces of a cati, subclass C31 represents 

Ln: number of sample data belonging to subclass n. two faces of a dog2, subclass C32 represents two faces 
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of a dog3, subclass C33 represents two faces of a hu- 
man i, subclass C41 represents two faces of human:, 
subclass C42 represents one face of dog4, subclass C43 
represents one face of humanj, and subclass C44 repre- 
sents four faces of human^. 5 
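The class and subclass listing of FIG. 10 can be written down as a small nested structure, which makes the leaf count and the per-category face counts easy to check. The following is only an illustrative sketch: the dictionary layout and the names `FACE_TREE`, `leaves` and `faces_per_category` are assumptions made here, not part of the described system.

```python
# Tree of FIG. 10 as listed in the text: leaf -> (category, number of sample faces).
# The nested-dict layout is an assumption made for illustration only.
FACE_TREE = {
    "C1": {"C11": ("cat", 2), "C12": ("dog", 1), "C13": ("cat", 1)},
    "C2": ("cat", 2),  # class C2 has no subclasses, so it is itself a leaf
    "C3": {"C31": ("dog", 2), "C32": ("dog", 2), "C33": ("human", 2)},
    "C4": {"C41": ("human", 2), "C42": ("dog", 1),
           "C43": ("human", 1), "C44": ("human", 4)},
}


def leaves(tree):
    """Yield (name, category, face_count) for every lowest-level class or subclass."""
    for name, node in tree.items():
        if isinstance(node, dict):
            yield from leaves(node)
        else:
            yield (name, node[0], node[1])


def faces_per_category(tree):
    """Sum the number of sample faces per category over all leaves."""
    totals = {}
    for _, category, count in leaves(tree):
        totals[category] = totals.get(category, 0) + count
    return totals


if __name__ == "__main__":
    print(len(list(leaves(FACE_TREE))))
    print(faces_per_category(FACE_TREE))
```

Every leaf holds faces of a single category, which is the stopping condition of the learning procedure.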

Using the learned tree structure of FIG. 10 and the
respective standard patterns shown in FIG. 10, it is
checked which subclass the unknown facial image data
belongs to, making it possible to correctly identify the
category of the facial image data at high speed and with
high precision.
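The check described above can be sketched as a descent along a single path of the tree, comparing the input only with the standard patterns at the current level. This is a minimal sketch under stated assumptions: the node layout (`pattern`, `category`, `children`), the function names, the toy two-dimensional values and the use of one rejection threshold at every level are all hypothetical choices made here; Euclidean distance is used for simplicity.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(children, x, reject_distance):
    """Follow one path down the tree; return a category, or None if rejected."""
    while True:
        # Compare only with the standard patterns at the current level.
        nearest = min(children, key=lambda node: euclidean(node["pattern"], x))
        if euclidean(nearest["pattern"], x) > reject_distance:
            return None  # too far from every pattern at this level
        if not nearest["children"]:
            return nearest["category"]  # lowest-level subclass reached
        children = nearest["children"]


def leaf(pattern, category):
    """Build a lowest-level node holding a single category."""
    return {"pattern": pattern, "category": category, "children": []}


# Toy two-dimensional tree; all values are invented for illustration.
TOY_TREE = [
    {"pattern": (0.0, 0.0), "category": None,
     "children": [leaf((-1.0, 0.0), "white"), leaf((1.0, 0.0), "black")]},
    leaf((10.0, 10.0), "black"),
]

if __name__ == "__main__":
    print(classify(TOY_TREE, (0.9, 0.1), 5.0))
    print(classify(TOY_TREE, (50.0, 50.0), 5.0))
```

In this toy tree an input near the first class is compared with two class patterns and two subclass patterns, while an input far from every pattern is rejected at the first level.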

The configuration of the system of the invention will 
be described by referring to FIG. 12. 

Image sample data for learning are read in from a
scanner 1 and stored in an image memory unit of memory
4. The learning unit 2 generates standard patterns in
a tree structure of FIG. 3 according to the learning
procedure of FIG. 4, by using the image sample data
stored in memory 4. The standard patterns generated
are stored in a standard pattern memory unit of memory
4.

Then, image data of unknown category are taken in
from the scanner 1. The decision unit 3 compares the
image data of unknown category with the standard
patterns stored in the standard pattern memory unit
according to the input data category identification procedure
of FIG. 2 to determine its category. The CPU 5
controls the overall operation sequence, i.e., input of
image and operation of the learning unit 2, decision unit
3 and memory 4.

As explained above, this invention converts a so- 
called nonlinear separation problem, in which sample 
data contain plural categories, into a plurality of partial 
linear problems by dividing data into subclasses each of 
which is linear by having only data of one category.
Hence, the invention can be applied to nonlinear separa- 
tion problems and can also perform similarity calcula- 
tions at high speed. 
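The division into single-category subclasses can be sketched as a recursive split. The sketch below is not the patented procedure itself: the update of Equation 2 is replaced by the simpler rule of moving each standard pattern to the mean of its group (a k-means style step), the initial offsets `eps`, the choice of two subclasses per split, and the majority-category fallback for groups that cannot be separated are all assumptions, and every function name is hypothetical.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(samples, k=2, eps=1e-2, tol=1e-9, max_iter=100):
    """Divide (point, category) samples into up to k groups: [(pattern, group)]."""
    dim = len(samples[0][0])
    center = [sum(p[i] for p, _ in samples) / len(samples) for i in range(dim)]
    # Step 15: place k standard patterns near the distribution center.
    patterns = [[c + (j - (k - 1) / 2) * eps for c in center] for j in range(k)]
    prev_error = None
    for _ in range(max_iter):
        # Step 16: attach every sample to its nearest standard pattern.
        groups = [[] for _ in range(k)]
        for point, category in samples:
            j = min(range(k), key=lambda j: sq_dist(patterns[j], point))
            groups[j].append((point, category))
        # Step 17: error E in the form of Equation 1.
        error = sum(sum(sq_dist(patterns[j], p) for p, _ in g) / len(g)
                    for j, g in enumerate(groups) if g)
        # Step 18: stop once the error no longer changes appreciably.
        if prev_error is not None and abs(prev_error - error) < tol:
            break
        prev_error = error
        # Step 19 (simplified, assumption): move each pattern to its group mean.
        patterns = [[sum(p[i] for p, _ in g) / len(g) for i in range(dim)]
                    if g else patterns[j] for j, g in enumerate(groups)]
    return [(patterns[j], g) for j, g in enumerate(groups) if g]


def build(samples, pattern=None):
    """Recursively split until every leaf holds samples of one category."""
    node = {"pattern": pattern, "category": None, "children": []}
    counts = Counter(category for _, category in samples)
    parts = split(samples) if len(counts) > 1 else []
    if len(parts) < 2:
        # Pure group, or (assumed fallback) a group that cannot be separated.
        node["category"] = counts.most_common(1)[0][0]
        return node
    node["children"] = [build(group, p) for p, group in parts]
    return node


def leaf_categories(node):
    """Categories of the lowest-level subclasses, left to right."""
    if not node["children"]:
        return [node["category"]]
    return [c for child in node["children"] for c in leaf_categories(child)]


# White, black, white along one axis: not separable by a single boundary.
SAMPLES = [((0.0, 0.0), "white"), ((1.0, 0.0), "white"),
           ((4.0, 0.0), "black"), ((5.0, 0.0), "black"),
           ((9.0, 0.0), "white"), ((10.0, 0.0), "white")]

if __name__ == "__main__":
    print(leaf_categories(build(SAMPLES)))
```

The toy samples alternate white, black, white along one axis, so no single boundary separates the two categories, yet each leaf produced by the recursive split contains one category only.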

The invention uses sample data and generates a tree
structure for data classification in a learning mode, and,
according to the tree structure, a decision is made about
input data in a recognition mode, making it unnecessary
to compare the unknown input data with all the standard
patterns, which in turn speeds up the processing.
The similarity of the input unknown data with the standard
patterns can be determined by calculating the
distance between them. By checking the similarity between
the standard patterns and the input unknown data
through the distance between them, unexpected data of
unknown category can be rejected, improving the accuracy
of the recognition decision.
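The description names the Euclidean distance and the Mahalanobis distance as two possible similarity measures. The sketch below contrasts them for two-dimensional data. It assumes a population covariance (division by the number of samples) and writes the 2x2 matrix inverse out by hand; the function names and the sample values are hypothetical.

```python
import math


def mean_and_covariance(points):
    """Mean vector and 2x2 population covariance of two-dimensional points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), ((sxx, sxy), (sxy, syy))


def euclidean(x, center):
    """Euclidean distance from x to the standard pattern (center)."""
    return math.hypot(x[0] - center[0], x[1] - center[1])


def mahalanobis(x, center, cov):
    """Mahalanobis distance from x to center for a 2x2 covariance matrix."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - center[0], x[1] - center[1]
    # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, with the 2x2 inverse written out.
    return math.sqrt((d * dx * dx - 2 * b * dx * dy + a * dy * dy) / det)


# Subclass samples spread more widely along x than along y (invented values).
SUBCLASS = [(-2.0, 0.0), (2.0, 0.0), (0.0, -1.0), (0.0, 1.0)]

if __name__ == "__main__":
    center, cov = mean_and_covariance(SUBCLASS)
    for x in [(2.0, 0.0), (0.0, 2.0)]:
        print(x, euclidean(x, center), mahalanobis(x, center, cov))
```

Both test points lie at the same Euclidean distance from the standard pattern, but the Mahalanobis distance is larger for the point lying in the direction in which the subclass samples vary less, so a rejection threshold applied to it takes the shape of the subclass into account.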

While the preferred embodiment has been set forth 
along with modifications and variations to show spe- 
cific advantageous details of the present invention, further
embodiments, modifications and variations are
contemplated within the broader aspects of the present 
invention, all as set forth by the spirit and scope of the 
following claims. 

What is claimed: 

1. An automatic pattern recognition method to determine
a category of input data points of unknown category,
comprising:
performing a first learning step of dividing sample
data points into classes, which includes

generating sample data points whose categories are
known,

generating a plurality of standard patterns in an
n-dimensional space at arbitrary positions near a center
of sample data points distribution, which standard
patterns correspond to an arbitrary plurality
of classes of the sample data points,

calculating the distances between individual sample 
data points and each of the standard patterns to 
determine the nearest standard pattern for each 
sample data point, 

temporarily classifying each sample data point as 
belonging to the class corresponding to the nearest 
standard pattern, 

calculating the summation of differences between 
each standard pattern and the corresponding sam- 
ple data points for each dimension for each class, 

moving the standard patterns in a direction repre- 
sented by the summation of differences for each 
class, 

temporarily classifying each sample data point as 
belonging to the class corresponding to the nearest 
moved standard pattern, 

recalculating the distances between each sample data 
point and each moved standard pattern to deter- 
mine the nearest standard pattern for each sample 
data point, 

moving the standard patterns in a direction repre- 
sented by the summation of differences, 

repeating the preceding three steps until the summa- 
tion of differences between each standard pattern 
and the corresponding sample data points is smaller 
in each dimension than a set specific value, and 

determining a final position of each standard pattern 
and sample data points belonging to the class repre- 
sented by the corresponding standard pattern; 

performing a second learning step, when the sample 
data points belonging to one class do not belong to 
the same category, which includes 

dividing the one class into a plurality of subclasses, 
and 

repeating said step of dividing for the sample data
points for each remaining class and subclass that
has sample data points of more than one category;

performing a third learning step of relating the stan- 
dard patterns, classes and subclasses obtained in the 
first learning step and second learning step to each 
other in a tree-structure representation and storing 
the tree-structure representation in memory; 

inputting data points of unknown category; and 

determining recognition/nonrecognition of the input 
data points of unknown category based on corre- 
spondence/lack of correspondence between the 
input data points and the stored tree-structure rep- 
resentation. 

2. The method of claim 1, wherein the step of deter- 
mining recognition/nonrecognition includes the steps 
of: 

identifying the classes in the tree-structure to which 
the input data points of unknown category are 
nearest in distance by checking the distance of the 
input data points to the standard pattern of each 
class; 

when the identified class has subclasses in the tree- 
structure, identifying the subclasses to which the 
input data points are nearest in distance by check- 
ing the distance to the standard pattern of each 
subclass; 

repeating this last identifying step with the identified
subclass until a lowest-level subclass is reached;
and

identifying the category of the lowest-level subclass
to which the input data points are closest as the
category of the input data points.

3. A method as claimed in claim 2, wherein when the
distances between the input data points of unknown
category and the standard patterns of classes or subclasses in
any one of said identifying steps are greater than a
predetermined value, discarding the input data points as
having an unknown category dissimilar to the categories
of the sample data points.

4. An automatic pattern recognition method to deter- 
mine a category of input data points of unknown cate- 
gory, comprising: 

performing a first learning step of dividing sample
data into classes, which includes

generating sample data points whose categories are
known,

generating a plurality of standard patterns in an n- 
dimensional space at arbitrary positions near a cen- 
ter of sample data points distribution, which stan- 
dard patterns correspond to an arbitrary plurality 
of classes of the sample data points, 

calculating the distances between individual sample
data points and each of the standard patterns to
determine the nearest standard pattern for each
sample data point,

temporarily classifying each sample data point as
belonging to the class corresponding to the nearest
standard pattern,

adjusting the positions of the standard patterns if the 
difference between each standard pattern and cen- 
ter of distribution of the corresponding sample data 
points is not smaller in each dimension than a set 
specific value; 

repeating the preceding three steps until the difference
between each standard pattern and center of
distribution of the corresponding sample data
points is smaller in each dimension than a set specific
value, and

determining a final position of each standard pattern
and sample data points belonging to the class represented
by the corresponding standard pattern;



performing a second learning step, when the sample 
data points belonging to one class do not belong to 
the same category, which includes 

dividing the one class into a plurality of subclasses, 
and 

repeating said step of dividing for the sample data 
points of each remaining class and subclass that has 
sample data points of more than one category; 

performing a third learning step of relating the standard
patterns, classes and subclasses obtained in the
first learning step and second learning step to each
other in a tree-structure representation and storing
the tree-structure representation in memory;

inputting data points of unknown category; and 

determining recognition/nonrecognition of the input 
data points of unknown category based on corre- 
spondence/lack of correspondence between the 
input data points and the stored tree-structure rep- 
resentation. 

5. The method of claim 4, wherein the step of deter- 
mining recognition/nonrecognition includes steps of: 

identifying the classes in the tree-structure to which
the input data points of unknown category are
nearest in distance by checking the distance of the
input data points to the standard pattern of each
class;

when the identified class has subclasses in the tree-
structure, identifying the subclasses to which the
input data points are nearest in distance by checking
the distance to the standard pattern of each
subclass;

repeating this last identifying step with the identified 
subclass until a lowest-level subclass is reached; 
and 

identifying the category of the lowest-level subclass 
to which the input data points are closest as the 
category of the input data points. 

6. The method of claim 4, wherein when the distances 
between the input data points of unknown category and 
the standard patterns of classes or subclasses in any one 
of said identifying steps are greater than a predeter- 
mined value, discarding the input data points as having 
an unknown category dissimilar to the categories of the 
sample data points. 
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